Heat Map

Word Clouding

Word Clouding of Crime Types: The top number of crimes appeared in 5 years


After reviewing the number trends and density of crime, our group also want to dig further about the primary type of crime with high incidence in Chicago. As a result, we decide to make a word clouding to give greater prominence to crime type that appear more frequently. According to this figure 3, the most frequently occurring type of crime in Chicago from 2017 to 2021 is narcotics, followed by weapons violation, theft, battery, assault, other offense, and criminal trespass etc.

---
title: "Exploratory Analysis Dashboard"
output: 
  flexdashboard::flex_dashboard:
    social: menu
    source: embed
    theme: flatly
---

```{r setup}
library(tidyverse)
library(leaflet)
library(lubridate)
library(ggExtra)
library(plotly)
library(maps)
library(mapdata)
library(ggthemes)
library(mapproj)
library(ggthemes)
library(gganimate)
library(viridis)
library(wordcloud)
library(RColorBrewer)
library(tm)
library(dplyr)
library(scales)
library(gganimate)
theme_set(theme_minimal() + theme(legend.position = "bottom"))
```

```{r}
crime_full=read.csv("/Users/yitian/Desktop/p8105_final_project/data/five_year_data.csv")
```


Trends in Time{.storyboard}
=========================================
### The Number of Crimes Monthly Trends Over 5 Years

```{r, warning=FALSE, message=FALSE}
count_overtime = 
  crime_full %>% 
  group_by(year,month) %>% 
  summarize(cases = n())
count_overtime %>%
  plot_ly(x = ~month, y = ~cases, color=~year, type = "scatter",mode = "lines+markers")  %>% 
  layout(
    title = "Figure 1: The Number of Crime Monthly Trends From 2017 to 2021 in Chicago.",
    xaxis = list(title = "Month"),
    yaxis = list(title = "Total Crime Cases")
  )
```

***
The figure shows that the number of crimes in Chicago change over the most recent six years. The range of the number of crimes per year is between 4629 and 10824 from 2016 to 2021. 

We could see that, the peak number of the crimes over time is 2020 with 10824 cases. Basically, the trend went smooth from 2016 to 2020 and went down after 2020.

Heat Map{.storyboard}
=========================================
### The Monthly Density Trends of Crimes Over 5 Years in Chicago

```{r, message=FALSE}
heatmap_plot = crime_full %>%
  mutate(month = month.abb[as.numeric(month)],
         month = fct_rev(factor(month, levels = month.abb))) %>% 
  group_by(year, month, day) %>% 
  summarise(n_crimes = n()) %>% 
  mutate(day = as.numeric(day))
 
heatmap_plot = heatmap_plot %>% 
  ggplot(aes(x = day, y = month, fill = n_crimes))+
  geom_tile(color = "white",size = 0.1) + 
  scale_fill_viridis(name = "Number of Crimes",option = "C") + 
  facet_grid(.~ year) +
  scale_x_continuous(breaks = c(1,10,20,31)) + 
  theme_minimal(base_size = 8) + 
  labs(title = "Figure 2: The Monthly Density Trends of Crimes from 2017 to 2021 in Chicago", x = "Day", y = "Month") + 
  theme(legend.position = "bottom")+
  theme(plot.title = element_text(size = 14))+
  theme(axis.text.y = element_text(size = 6)) +
  theme(strip.background = element_rect(colour = "white"))+
  theme(plot.title = element_text(hjust = 0))+
  theme(axis.ticks = element_blank())+
  theme(axis.text = element_text(size = 7))+
  theme(legend.title = element_text(size = 8))+
  theme(legend.text = element_text(size = 6))+
  removeGrid()
ggplotly(heatmap_plot + theme(legend.position = "none"))
```

***
This heat map clearly shows that the crime rate has a significant increase in 2020, and the significant increase of crime in Chicago is probably associated with the COVID-19 prevalence. In 2021, the total crime rates is a little bit lower than previous year. In general, there is no significant differences on crime rates with diffeernt months. 

Word Clouding{.storyboard}
=========================================
### Word Clouding of Crime Types: The top number of crimes appeared in 5 years

```{r, warning=FALSE, message=FALSE}
crime = crime_full %>%
   group_by(primary_type) %>% 
   summarise(n_crime = n())
   set.seed(555)
   wordcloud(words = crime$primary_type, freq = crime$n_crime, scale = c(3, .8),min.freq = 1,
      max.words=200, random.order=FALSE, rot.per=0.35, 
      colors=brewer.pal(8, "Dark2"))
title( "Figure 3: Wordclouding of the top Number of Crimes Over 5 Years")
```

***
After reviewing the number trends and density of crime, our group also want to dig further about the primary type of crime with high incidence in Chicago. As a result, we decide to make a word clouding to give greater prominence to crime type that appear more frequently.
According to this figure 3, the most frequently occurring type of crime in Chicago from 2017 to 2021 is narcotics, followed by weapons violation, theft, battery, assault, other offense, and criminal trespass etc.

Trends in Location{.storyboard}
=========================================
### Top 10 Locations With High Criminal Incidence Over 5 Years

```{r, warning=FALSE, message=FALSE}
## clean a new dataset with ranking of numbers and adjusted numbers
crime_loc = crime_full %>%
  group_by(month,location_description) %>% 
  summarise(n = n()) %>% 
  mutate(rank = rank(-n),
         Value_rel = n/n[rank == 1],
         Value_lbl = paste0(" ",n)) %>% 
  filter(rank <= 10)

## form multiple static plots
staticplot = ggplot(crime_loc, aes(rank, group = location_description, 
                fill = as.factor(location_description), color = as.factor(location_description))) +
  geom_tile(aes(y = n / 2,
                height = n,
                width = 0.9), alpha = 0.8, color = NA) +
  geom_text(aes(y = 0, label = paste(location_description, " ")), vjust = 0.2, hjust = 1) +
  geom_text(aes(y = n,label = Value_lbl, hjust = 0)) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_x_reverse() +
  scale_fill_viridis_d(option = "B") +
  scale_color_viridis_d(option = "B") +
  theme_minimal() +
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_line( size = .1, color = "grey" ),
        panel.grid.minor.x = element_line( size = .1, color = "grey" ),
        plot.title = element_text(size = 20, hjust = 0.5, face = "bold", vjust = 2),
        plot.subtitle = element_text(size = 14, hjust = 0.5, face = "italic", color = "grey"),
        plot.background = element_blank(),
        plot.margin = margin(2,2, 2, 4, "cm"))

## convert static plots to animated ones
anim_plot = staticplot + 
  transition_states(month, transition_length = 4, state_length = 1) +
  ease_aes('sine-in-out') +
  labs(title = 'Figure 4: Number of Crime in Chicago Over 5 Years : [Month: {closest_state}]',
       subtitle = "Top 10 Location")  

animate(anim_plot, 150, fps = 20, width = 780, height = 560)
```

***
In addition, our group also sorted and ranked the data, and compiled the top 10 locations with high criminal incidence from 2017 to 2021 into a dynamic bar graph to get a more in-depth understanding of the crime situation. Through this motion chart, we can visually observe the names of the crime areas, the number of crime incidents, and their changing trends.It is clear that streets and sidewalks are the locations with high crime rates during the five-year period.